1. INTRODUCTION

Sleep plays an important role in daily life, and a lack of sleep can affect a person’s health [1]. Several
diseases and inflammatory conditions are related to sleep disorders, namely heart disease, stroke, diabetes,
and depression [2], [3]. Sleep stage classification is often used to assess sleep quality [4] and to identify
sleep disorders so that the chronic diseases they cause can be prevented [5]. As one of the sleep analysis
techniques, it has also been used to score the sleep stages of patients with obstructive sleep apnea [6].
However, selecting the proper electrocardiogram (ECG) features is still considered challenging and limits the
performance that the algorithm can achieve. It is therefore necessary to investigate which ECG features
contribute most to the performance of the algorithm. In addition, the method used needs to be optimized to
reach the maximum agreement rate.

Several conventional machine learning techniques have been used to classify sleep stages from
electroencephalogram (EEG), electromyogram (EMG), and electrooculogram (EOG) signals [7]-[11]. Hanaoka
et al. [12] performed automatic sleep scoring using decision tree learning, generating a tree that classifies
the signal data according to its classes. Khushaba et al. [13] scored sleep stages from EEG, EMG, and EOG
signals using orthogonal-locality sensitive fuzzy discriminant analysis. Sleep stages have also been identified
from EEG signals using a multi-class support vector machine (SVM) [14]. Moeynoi and Kitjaidure [15] used
dimension reduction based on canonical correlation analysis (CCA) to classify sleep stages from EEG and ECG
signals. Sleep stages can be derived not only from EEG and ECG signals but also from a combination of
ECG-derived heart rate variability (HRV) and respiration [16]. Recently, deep learning approaches have been
used to classify sleep stages. Vilamala et al. [17] designed a deep convolutional neural network (DCNN)
architecture to interpret sleep stages from EEG signals. Similarly, Supratak et al. [18] classified sleep
stages from raw single-channel EEG data using the DeepSleepNet architecture. A DCNN has also been used to
differentiate the awake state from the sleep state using actigraph data [19]. Even though deep learning
approaches have shown better performance, it is hard to determine from them which ECG features are the most
significant and should be recommended.

In this study, we applied a conventional machine learning method, the SVM, to classify binary sleep stages
from three-channel Holter ECG (V5, CC5, and V5R) recordings of ten subjects referred to the sleep disorders
clinic [20]. To improve classification performance, grid search was applied to find the best parameters for
the SVM. In addition, we used information gain feature selection to investigate which ECG features are most
significant to the performance of the algorithm. The remainder of this paper is organized into four sections.
Section 2 describes the methods used for the classification of sleep stages. Section 3 presents the
classification results of the SVM and the optimized SVM. Section 4 discusses these results, and Section 5
concludes the paper.


2. METHOD 
2.1. Segmentation and filtering 

The overall process of sleep stage classification in this experiment is shown in Figure 1. The recorded
ECG signals were taken from the St. Vincent’s University Hospital sleep apnea dataset. In this implementation,
the sleep states were differentiated segment by segment. The ECG signal from each subject was segmented into
30-second epochs at a sampling frequency of 100 Hz, producing 3,000 samples per epoch. Each segment was then
filtered using a finite impulse response (FIR) filter with a passband of 0.05-35 Hz.
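The segmentation and filtering step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the windowed-sinc design, the Hamming window, and the tap count of 301 are hypothetical choices, since the text specifies only an FIR filter with a 0.05-35 Hz band, 30-second epochs, and a 100 Hz sampling frequency.

```python
import numpy as np

FS = 100            # sampling frequency in Hz (from the text)
EPOCH_SECONDS = 30  # epoch length in seconds (from the text)


def segment_epochs(signal, fs=FS, epoch_seconds=EPOCH_SECONDS):
    """Split a 1-D signal into non-overlapping epochs, dropping any incomplete tail."""
    samples_per_epoch = fs * epoch_seconds  # 3,000 samples at 100 Hz
    signal = np.asarray(signal, dtype=float)
    n_epochs = len(signal) // samples_per_epoch
    return signal[: n_epochs * samples_per_epoch].reshape(n_epochs, samples_per_epoch)


def fir_bandpass_taps(low_hz, high_hz, fs=FS, numtaps=301):
    """Windowed-sinc band-pass taps: low-pass(high_hz) minus low-pass(low_hz).

    numtaps=301 and the Hamming window are assumptions made for this sketch.
    """
    n = np.arange(numtaps) - (numtaps - 1) / 2  # tap indices centred on zero

    def lowpass(cutoff_hz):
        fc = cutoff_hz / fs  # cutoff in cycles per sample
        return 2 * fc * np.sinc(2 * fc * n)  # ideal low-pass impulse response

    return (lowpass(high_hz) - lowpass(low_hz)) * np.hamming(numtaps)


def filter_epoch(epoch, taps):
    """Apply the symmetric FIR filter; mode='same' keeps the epoch length and alignment."""
    return np.convolve(epoch, taps, mode="same")


# Synthetic stand-in for one ECG channel: 95 seconds of a 10 Hz sine wave.
t = np.arange(9500) / FS
epochs = segment_epochs(np.sin(2 * np.pi * 10 * t))
taps = fir_bandpass_taps(0.05, 35)
filtered = np.array([filter_epoch(e, taps) for e in epochs])
print(epochs.shape, filtered.shape)
```

With only 301 taps the transition band is roughly 1 Hz wide, so the 0.05 Hz lower edge is not resolved sharply; a real pipeline would likely need a longer filter or a different design for that edge.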


[Figure 1 block diagram: Segmentation (30 seconds, 100 Hz), Filtering (Butterworth, 0.05-35 Hz), Feature
extraction, Sleep estimation (optimized SVM-Grid), Performance check]

Figure 1. Block diagram of sleep states classification using three-channel Holter ECGs (V5, CC5, V5R).
Benchmark test is conducted using SVM and optimized SVM using grid search method


2.2. Feature extraction 

After each segment was filtered, features were extracted from the filtered data using the mean,
variance, and standard deviation. The mean is the average of the samples in one segment.
It is calculated using (1):


$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x(i) \tag{1}$$


where N is the number of samples in one segment and x(i) is the i-th sample of the segment.

The variation within each segment was also calculated. To distinguish sleep stages, we computed the
standard deviation of each segment using (2) and its variance using (3):


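$$\mathrm{Std}(x) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(x(i) - \bar{x}\right)^{2}} \tag{2}$$
$$\mathrm{Var}(x) = \frac{1}{N} \sum_{i=1}^{N} \left(x(i) - \bar{x}\right)^{2} \tag{3}$$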


where x(i) is the i-th sample and \bar{x} is the mean of the segment from (1).
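As a quick illustration of (1) to (3), the three statistical features can be computed per epoch as in the sketch below. The function name is hypothetical, and the population form (division by N rather than N - 1) is an assumption made here for consistency with (1).

```python
import numpy as np


def statistical_features(epochs):
    """Return one row [mean, standard deviation, variance] per epoch."""
    epochs = np.atleast_2d(np.asarray(epochs, dtype=float))
    mean = epochs.mean(axis=1)
    std = epochs.std(axis=1)  # NumPy default ddof=0 divides by N (assumed form)
    var = epochs.var(axis=1)  # same convention as the standard deviation
    return np.column_stack([mean, std, var])


# Tiny worked example: one "epoch" with four samples.
features = statistical_features([[1.0, 2.0, 3.0, 4.0]])
print(features)
```

For the four-sample example the mean is 2.5 and the variance is 1.25, so the standard deviation is the square root of 1.25.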
Figure 2(a) shows 1,000 samples of an ECG signal recorded in the awake state, while Figure 2(b)
shows an ECG signal recorded in the sleep state. The awake state can be differentiated from the sleep
state by observing the patterns of the ECG signals.
Figure 2. ECG signals recorded using a Reynolds Lifecard CF system: (a) 1,000 samples of the ECG signal in the
awake state and (b) 1,000 samples of the ECG signal in the sleep state


2.3. Support vector machine 

SVM is one of the most popular machine learning techniques. The SVM classifier separates the classes
by seeking the hyperplane that maximizes the margin between them [21]. The input data can be handled with
several kernels, which are grouped into linear and nonlinear kernels. For nonlinear processing, the kernel
trick is often applied so that the SVM can find the best separation between the classes [22].
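For reference, the maximum-margin idea described above is commonly written as the soft-margin problem below, followed by the usual definitions of the four kernel families. This is the standard textbook formulation rather than one given in the source; the symbols w, b, \xi_i, \phi, and the offset r are introduced here only for illustration, and the reading of C, \gamma, and d as the regularization, kernel-scale, and polynomial-degree parameters is an assumption.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad y_i\left(w^{\top}\phi(x_i) + b\right) \ge 1 - \xi_i,\qquad \xi_i \ge 0

K_{\text{linear}}(x, x') = x^{\top}x', \qquad
K_{\text{rbf}}(x, x') = \exp\left(-\gamma\lVert x - x'\rVert^{2}\right)

K_{\text{poly}}(x, x') = \left(\gamma\, x^{\top}x' + r\right)^{d}, \qquad
K_{\text{sigmoid}}(x, x') = \tanh\left(\gamma\, x^{\top}x' + r\right)
```

A larger C penalizes margin violations more heavily, which is why it is a natural parameter to tune.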

In this paper, we used grid search as an optimization method that automatically finds the SVM parameters
giving the best classification performance. We set up the initial parameters and then defined the sample
spaces of the SVM parameters. Algorithm 1 shows the pseudocode of the optimized SVM using grid search.


Algorithm 1. An optimized SVM using grid search (pseudocode)
create SVM model as an initial classifier
initial model ← train the initial classifier (train data, train label)
define sample spaces of the SVM parameter candidates:
    'C' ← [0.5, 1.0, 1.5, 2.0, 2.5]
    'kernel' ← ['linear', 'rbf', 'poly', 'sigmoid']
    'degree' ← [2, 3, 4, 5, 6]
    'gamma' ← ['scale', 'auto']
grid search model ← create model (initial model, sample spaces)
optimized parameters ← train grid search model (train data, train label)
prediction ← grid search model with optimized parameters (test data)
accuracy ← accuracy score (test label, prediction)
precision ← precision score (test label, prediction)
recall ← recall score (test label, prediction)
return accuracy, precision, recall
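The pseudocode in Algorithm 1 could be realized roughly as in the sketch below. It assumes scikit-learn, whose `SVC` parameter names happen to match those in the pseudocode, although the source does not state which library was used. The function name, the 3-fold cross-validation, the weighted averaging of precision and recall, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC


def optimized_svm(X_train, y_train, X_test, y_test, cv=3):
    """Grid-search the SVM parameters listed in Algorithm 1 and score on test data."""
    param_grid = {
        "C": [0.5, 1.0, 1.5, 2.0, 2.5],
        "kernel": ["linear", "rbf", "poly", "sigmoid"],
        "degree": [2, 3, 4, 5, 6],   # only used by the 'poly' kernel
        "gamma": ["scale", "auto"],  # ignored by the 'linear' kernel
    }
    # GridSearchCV clones the estimator, so no separate initial fit is needed here.
    search = GridSearchCV(SVC(), param_grid, cv=cv, scoring="accuracy")
    search.fit(X_train, y_train)
    y_pred = search.predict(X_test)  # uses the refitted best estimator
    return {
        "best_params": search.best_params_,
        "accuracy": accuracy_score(y_test, y_pred),
        # Weighted averaging over both classes is an assumption of this sketch.
        "precision": precision_score(y_test, y_pred, average="weighted", zero_division=0),
        "recall": recall_score(y_test, y_pred, average="weighted", zero_division=0),
    }


# Synthetic stand-in for per-epoch features (0 = awake, 1 = sleep); not real ECG data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (60, 3)), rng.normal(1.0, 1.0, (60, 3))])
y = np.array([0] * 60 + [1] * 60)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
result = optimized_svm(X_train, y_train, X_test, y_test)
print(result)
```

The grid contains 5 x 4 x 5 x 2 = 200 parameter combinations, so the search cost grows quickly with the number of epochs and cross-validation folds.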


2.4. Metric evaluation 

The performance of the optimized model in classifying sleep stages was evaluated using three metrics,
namely accuracy, precision, and recall. Accuracy is the sum of the true positives (TP) and true negatives
(TN), that is, the predictions that match the actual class, divided by the total number of predictions, as
formulated in (4):


$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FN + FP} \tag{4}$$

A false negative (FN) is a segment of a given class that the classifier assigned to the other class, and a
false positive (FP) is a segment of the other class that the classifier assigned to the given class. In (5),
precision is the ratio of TP to all segments predicted as the given sleep state.

$$\text{Precision} = \frac{TP}{TP + FP} \tag{5}$$


Recall is the ratio of TP to all segments whose actual class is the given sleep state. It is
formulated as (6):


$$\text{Recall} = \frac{TP}{TP + FN} \tag{6}$$

In this study, we classified sleep stages into two classes, namely the awake and sleep states. The
classes were derived from ECG signals.
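The three metrics in (4) to (6) can be checked on a small hand-made example. The sketch below uses a hypothetical helper and toy labels (1 = sleep, 0 = awake); it scores a single positive class, whereas the averaging used to obtain the reported per-subject scores is not stated in the source.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall from TP, TN, FP, and FN counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels: TP = 3, TN = 3, FP = 1, FN = 1.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Here accuracy is 6/8 and both precision and recall are 3/4, so all three metrics equal 0.75.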


3. RESULTS 

In this section, we report the findings of the simulation. Classification performance was investigated
in three scenarios. First, the performance of the SVM and the optimized SVM was analyzed using three
statistical features, namely mean, variance, and standard deviation. Second, to validate the performance of
the algorithm, seven additional features were added to those of the first scenario. The seven additional
features are crest factor, kurtosis, energy, skewness, spectral frequency, entropy, and zero-crossing, as
defined by Sunarya et al. [23]. In the last scenario, information gain feature selection was used to find the
three most significant features, which were then applied to the algorithm.
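To make some of the additional features concrete, the sketch below computes five of them with common textbook definitions. These definitions are assumptions and may differ from those in [23]; the spectral and entropy features are omitted because their exact form cannot be inferred from the text, and the function name is hypothetical.

```python
import numpy as np


def extra_features(epoch):
    """Textbook versions of five additional features (assumed, not taken from [23]).

    A constant epoch would cause a division by zero and is not handled here.
    """
    x = np.asarray(epoch, dtype=float)
    centered = x - x.mean()
    std = x.std()
    rms = np.sqrt(np.mean(x ** 2))
    return {
        "crest_factor": float(np.max(np.abs(x)) / rms),         # peak over RMS
        "kurtosis": float(np.mean(centered ** 4) / std ** 4),   # non-excess form
        "skewness": float(np.mean(centered ** 3) / std ** 3),
        "energy": float(np.sum(x ** 2)),
        # Number of sign changes between consecutive samples.
        "zero_crossings": int(np.sum(np.signbit(x[:-1]) != np.signbit(x[1:]))),
    }


# Alternating toy signal: unit RMS, zero mean, and a sign change at every step.
example = extra_features([1.0, -1.0, 1.0, -1.0])
print(example)
```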

Figure 3(a) shows the accuracy for the ten subjects using the original SVM and the optimized SVM.
Accuracy improved significantly after grid search was applied. Grid search lifted the accuracy of each
subject, giving an average accuracy of 84.03%, which is 6.30 percentage points higher than that of the
original SVM (77.73%). The highest score, 90.56%, was obtained by subject 10 using the optimized SVM, while
the lowest was obtained by subject 5: 68.73% with the original SVM and 74.18% with the optimized SVM.
Overall, the optimized SVM was superior to the original SVM in terms of accuracy.

Figure 3(b) shows the classification performance in terms of precision, that is, the share of segments
predicted as the awake or sleep state that actually belong to that state. As with accuracy, the average
precision of the optimized SVM (82.69%) was higher than that of the original SVM (77.82%). Both the optimized
and the original SVM had their lowest precision on subject 3, 58.20% and 55.54%, respectively. These results
indicate that optimizing the SVM parameters improved the performance of the SVM method.

Figure 3(c) shows the average recall for each subject. Here, recall reflects the share of segments of
each actual class (awake or sleep) that the classifier predicted correctly. Overall, the optimized SVM
outperformed the original SVM, with an average recall of 84.03%, which is 6.67 percentage points higher than
that of the original SVM (77.36%). As with accuracy and precision, subjects 3, 5, and 9 showed lower scores,
with subject 5 the lowest (68.73%). The highest recall was obtained by subject 10 (90.56%).

The classification performance of the algorithm was then investigated using 10 features, as shown in
Figure 4. The 10 features consist of the three statistical features of the first scenario and the seven
additional features. Figure 4(a) shows the average accuracy on the training and test data, both using the
optimized SVM. Subject 10 reached the highest accuracy, 93.60% on the training data and 92.72% on the test
data. The lowest accuracy occurred for subject 9 on the training data and for subject 1 on the test data.
Figure 4(b) shows the precision of the algorithm. Among all subjects, subject 10 reached the highest
precision, 91.28%, while subject 9 had the lowest, 77.72%. Figure 4(c) shows the recall. As with accuracy and
precision, subject 10 reached the highest recall, 92.72%.

Compared to the first scenario, the scenario using 10 features performed better. The average accuracy
of the second scenario reached 85.46%, which is 1.43 percentage points higher than that of the first scenario
with grid search (84.03%) and 7.73 percentage points higher than that of the first scenario without grid
search (77.73%). Likewise, the average precision (84.05%) and recall (85.44%) of the scenario using 10
features exceeded those of the first scenario using three statistical features (82.69% and 84.03%,
respectively).
Figure 3. The classification performances across 10 subjects using the SVM and optimized SVM based on
three statistical features: (a) average accuracy, (b) average precision, and (c) average recall


In the last scenario, the 10 ECG features used in the second scenario were investigated to determine
which features play the most significant role in the performance of the algorithm. The three best features
were then selected and used in the algorithm to confirm its classification performance. Information gain (IG)
feature selection was used to rank the features by significance. The higher the information gain, the more
significant the role of the feature in the performance of the algorithm. Figure 5 shows the features ordered
from the most to the least significant.
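One common way to compute information gain for a continuous feature is to discretize it and measure how much the class entropy drops once the bin is known. The sketch below follows that approach; the equal-width binning, the choice of 10 bins, the function names, and the synthetic data are assumptions, since the source does not describe how the information gain was estimated.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy (in bits) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


def information_gain(feature, labels, bins=10):
    """IG = H(labels) - H(labels | binned feature), using equal-width bins."""
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels)
    edges = np.histogram_bin_edges(feature, bins=bins)
    binned = np.digitize(feature, edges[1:-1])  # bin index in 0 .. bins - 1
    conditional = 0.0
    for b in np.unique(binned):
        mask = binned == b
        conditional += mask.mean() * entropy(labels[mask])  # weight by bin share
    return entropy(labels) - conditional


def rank_features(X, y, names, bins=10):
    """Return (name, gain) pairs sorted from highest to lowest information gain."""
    gains = [information_gain(X[:, j], y, bins) for j in range(X.shape[1])]
    order = np.argsort(gains)[::-1]
    return [(names[j], gains[j]) for j in order]


# Synthetic check: one feature tracks the label, the other is pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 500)
X = np.column_stack([y + rng.normal(0, 0.3, 500), rng.normal(0, 1, 500)])
ranking = rank_features(X, y, ["informative", "noise"])
print(ranking)
```

Selecting the top three entries of such a ranking corresponds to the feature selection step of the last scenario.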

Based on the information gain shown in Figure 5, the features in descending order are crest factor,
standard deviation, kurtosis, energy, mean, skewness, variance, spectral frequency, entropy, and
zero-crossing. The three best features, namely crest factor, standard deviation, and kurtosis, were then
applied to the algorithm, resulting in an average accuracy of 84.83%, an average precision of 76.17%, and an
average recall of 84.83%, as listed in Table 1. Table 1 shows the classification results of all scenarios in
the implementation, with the SVM using three statistical features as the reference point of the analysis.
Applying grid search to the first scenario significantly improved not only the accuracy but also the
precision and recall. To investigate the effect of the features, seven additional features were then added to
the algorithm, and the performance increased as a result. Finally, we picked the three best features based on
their information gain scores. The results show that the average accuracy obtained with the selected features
is superior to that of the reference scenario. Both the grid search optimization and the information gain
feature selection therefore played a significant role in improving classification performance.
Figure 4. The classification performances across 10 subjects utilizing 10 features: (a) average accuracy,
(b) average precision, and (c) average recall
Figure 5. The information gain of 10 features corresponding to performance algorithm


4. DISCUSSION 


In this section, the accuracy across all scenarios is discussed. Figure 6(a) compares the average
accuracy of the SVM using three statistical features with and without grid search optimization across ten
subjects. The curve without grid search is labeled ‘acc_3fts’, while the curve with grid search is labeled
‘acc_3fts_opt’. The results show that the average accuracy with grid search is higher than without it. Three
subjects, namely subjects 3, 5, and 9, show significantly lower accuracy than the rest. This might be due to
the subjects’ conditions, since all the subjects included in this implementation have a high potential for
sleep disorders, which might affect the ECG signals during recording. Figure 6(b) shows the average accuracy
when seven additional features were added to the algorithm of the reference scenario. The algorithm using ten
features performs relatively better than the one using only three statistical features. Moreover, with ten
features, the average accuracy increases after grid search is applied (‘acc_10fts_opt’) compared to the SVM
without grid search (‘acc_10fts’). Three out of the ten features were selected using the information gain
method to obtain the best features for ECG sleep classification: crest factor, standard deviation, and
kurtosis. With these three selected features, the average accuracy can be compared with that of the reference
scenario, which uses the same number of features. Figure 6(c) shows the average accuracy using the three
selected features. The results show that the SVM with three selected features outperforms the SVM with three
statistical features. Furthermore, the SVM using three selected features with grid search
(‘acc_3ftsAdds_opt’) was superior to the SVM without grid search.
Figure 6. The accuracy performances with and without optimization across ten subjects: (a) accuracy using 3
statistical features, (b) accuracy using 10 statistical features, and (c) accuracy using 3 selected features


Figure 7 shows the overall classification performance of the three scenarios using grid search
optimization across ten subjects. Figure 7(a) shows the average accuracy of the three optimized algorithms.
The accuracy of the optimized SVM using ten features (‘acc_10fts’) was relatively higher than that of the
optimized SVM using the three selected features (‘acc_3fts+’) or the three statistical features
(‘acc_3fts’). Similarly, the average precision (‘pre_10fts’) and recall (‘rec_10fts’) of the optimized SVM
using ten features were superior to those obtained using the three statistical features and the three
selected features, as shown in Figure 7(b) and Figure 7(c), respectively. Overall, the optimized SVM using
ten features outperformed the other scenarios. Moreover, the optimized SVM using the three selected features
was superior to the scenario using three statistical features and slightly inferior to the scenario using ten
features. Based on these results, grid search and information gain feature selection play an important role
in lifting classification performance.

Electrocardiogram feature selection and performance improvement of sleep stages ... (Lyra Vega Ugi)
2040 ISSN: 2302-9285
Figure 7. Classification performances of optimized methods across ten subjects: (a) accuracy of all scenarios,
(b) precision of all scenarios, and (c) recall of all scenarios


Table 2 benchmarks the proposed method against related works. In 2020, sleep stage classification was
conducted using a decision tree [24]. That study used the Poincaré plot and DFA as features and yielded an
accuracy of 60%. Linear discriminant analysis (LDA) was used to classify sleep stages with an average
accuracy of 71.93% in [25]. Mehdi et al. used eight frequency-domain features, three statistical time-domain
features, and two nonlinear features in their study. SVM methods using HRV time- and frequency-domain
features were used to classify sleep stages, resulting in average accuracies of 79.07% and 76% in [26] and
[16], respectively. Our proposed method used the SVM optimized with grid search, resulting in an average
accuracy of 85.46% using ten features and 84.03% using three statistical features. Furthermore, we introduced
information gain feature selection to select the most significant ECG features. The three most significant
features were selected and applied to our method, resulting in an average accuracy of 84.83%. As can be seen
in Table 2, the accuracy using the three selected features was superior to that of the other methods using
the same number of features. Overall, our method outperformed the related works.
Table 2. Benchmark of the optimized SVM using grid search against the related works

| Method | Year | Feature | Accuracy (%) |
|---|---|---|---|
| Decision tree [24] | 2020 | Poincare plot and DFA | 60 |
| LDA [25] | 2018 | 8 frequency domain, 3 statistical time domain and 2 nonlinear features | 71.93 |
| SVM-HRV [26] | 2012 | HRV frequency and time domain features | 79.07 |
| SVM-PCA [16] | 2019 | HRV frequency and time domain features, cross-spectra and magnitude squared coherence features | 76 |
| FCM [28] | 2012 | 55 time and frequency features | 80.62 |
| SVM + grid search | 2022 | mean, standard deviation, and variance | 84.03 |
| SVM + grid search | 2022 | crest factor, standard deviation, and kurtosis | 84.83 |
| SVM + grid search | 2022 | crest factor, standard deviation, kurtosis, energy, mean, skewness, variance, spectral flux, entropy, and zero-crossing | 85.46 |


5. CONCLUSION 

This paper addressed the binary classification of sleep stages using recorded ECG data from 10
subjects. The SVM method was used to differentiate the awake state from the sleep state. Grid search was then
used to improve classification performance by automatically finding the best parameters for the SVM. Finally,
the results of the optimized SVM were validated using accuracy, precision, and recall. The optimized SVM
obtained an average accuracy of 85.46%, a precision of 84.05%, and a recall of 85.44%. These results indicate
that optimizing the SVM with grid search can improve the classification of sleep stages. Moreover, we
investigated all ten features using information gain feature selection to identify the features most
significant to the performance of the algorithm, namely crest factor, standard deviation, and kurtosis.
APPENDIX

Table 1. The average accuracy, precision and recall of all scenarios

| Method | Feature | Accuracy (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|
| SVM | mean, standard deviation, and variance | 77.73 | 77.82 | 77.36 |
| SVM + grid search | mean, standard deviation, and variance | 84.03 | 82.69 | 84.03 |
| SVM | crest factor, standard deviation, kurtosis, energy, mean, skewness, variance, spectral flux, entropy, and zero-crossing | 82.60 | 80.07 | 82.59 |
| SVM + grid search | crest factor, standard deviation, kurtosis, energy, mean, skewness, variance, spectral flux, entropy, and zero-crossing | 85.46 | 84.05 | 85.44 |
| SVM | crest factor, standard deviation, and kurtosis | 83.27 | 74.93 | 83.24 |
| SVM + grid search | crest factor, standard deviation, and kurtosis | 84.83 | 76.17 | 84.83 |